and University of Sydney

The leading term in the normal approximation to the distribution 
of Student's t statistic is derived in a general setting, with the sole 
assumption being that the sampled distribution is in the domain of 
attraction of a normal law. The form of the leading term is shown to 
have its origin in the way in which extreme data influence properties 
of the Studentized sum. The leading-term approximation is used to 
give the exact rate of convergence in the central limit theorem up to 
order $n^{-1/2}$, where $n$ denotes sample size. It is proved that the exact
rate uniformly on the whole real line is identical to the exact rate on 
sets of just three points. Moreover, the exact rate is identical to that 
for the non-Studentized sum when the latter is normalized for scale 
using a truncated form of variance, but when the corresponding trun- 
cated centering constant is omitted. Examples of characterizations of 
convergence rates are also given. It is shown that, in some instances, 
their validity uniformly on the whole real line is equivalent to their 
validity on just two symmetric points. 



1. Introduction. The Studentized mean is an early example of one of the 
most common approaches to adaptive statistical inference, where a nuisance 
parameter is replaced by its estimator and the effect on inference carefully 
gauged. Initially, in the case of Student's t statistic, this was done under 
the assumption that the sampled distribution was normal, but later there 
developed a substantial literature, to which Gayen (1949, 1950, 1952) and 
Hyrenius (1950) were early contributors, on the effect of nonnormality on 
properties of the statistic. Wallace (1958), Bowman, Beauchamp and Shen- 
ton (1977) and Cressie (1980) have reviewed work in this area. Even in the 



Received March 2002; revised January 2003. 

AMS 2000 subject classifications. Primary 60F05; secondary 62E20. 

Key words and phrases. Berry-Esseen theorem, characterization of rate of convergence, 
domain of attraction, Edgeworth expansion, random norm, rate of convergence, self- 
normalized sum, Studentize. 

This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Probability. 
2004, Vol. 32, No. 2, 1419-1437. This reprint differs from the original in 
pagination and typographic detail. 






P. HALL AND Q. WANG 



case of normal data, where tables of the exact distribution have long been 
readily available, the issue of convergence (to normality) of the distribution 
of the t statistic has been of both theoretical and practical interest for many 
years; see, for example, Anscombe (1950) and Gayen (1952). 

From a theoretical viewpoint the problem of determining exact conver- 
gence rates for the t statistic can be a particularly awkward one. Despite the 
statistic's simple representation in terms of the mean and mean of squares 
of independent data, its distribution is surprisingly difficult to approximate 
using methods for sums of independent random variables. The problem has, 
of course, long been solved under sufficiently severe moment conditions, but 
its treatment in more theoretically interesting cases, when its distribution 
is asymptotically normal but few other assumptions are made, is far from 
straightforward. 

In a major advance, Bentkus and Götze (1996) gave bounds of general
Berry-Esseen type for rates of convergence in the central limit theorem
for Student's t statistic when the data are independent and identically
distributed. See also Chibisov (1980, 1984) and Slavova (1985). Bentkus,
Bloznelis and Götze (1996) extended Bentkus and Götze's arguments to
nonidentically distributed summands. Hall (1987) had earlier established
Edgeworth expansions under moment conditions that were no more severe
than existence of the moments actually appearing in the expansions. See
also van Zwet (1984), Friedrich (1989), Putter and van Zwet (1998),
Bentkus, Götze and van Zwet (1997), Wang and Jing (1999), Wang, Jing and
Zhao (2000) and Bloznelis and Putter (1998, 2002).

However, moment conditions, even finite variance, are not the main
prerequisite for convergence of the distribution of Student's t statistic. In
particular, Giné, Götze and Mason (1997) showed that a necessary and sufficient
condition for the Studentized mean to have a limiting standard normal dis- 
tribution is that the sampled distribution lie in the domain of attraction of 
the normal law. See also Logan, Mallows, Rice and Shepp (1973), Griffin 
and Mason (1991) and Egorov (1996). Although it is not of direct relevance 
to our work, we mention that the case where the data are from a time series 
is more complex. There, convergence in the conventional, deterministically 
normalized central limit theorem is not equivalent to convergence in the 
randomly normalized case; see Hahn and Zhang (1998). 

In the present paper we assume no more than that the sampled distribu- 
tion lies in the domain of attraction of the normal law, and describe rates 
of convergence, in the independent-data case, without reference to moment 
properties. We give the leading term in a normal approximation to the distri- 
bution of Student's t statistic, and show that its form is strongly influenced 
by the effects that large data have on the statistic. Using the leading term, 
we derive the exact convergence rate in the central limit theorem, up to 
terms of order $n^{-1/2}$ (where $n$ denotes sample size), or up to order $n^{-1}$
when the sampled distribution satisfies Cramér's continuity condition.

We show that, if the third moment should happen to be finite, the leading 
term transforms into the conventional first term in an Edgeworth expansion 
of the distribution of Student's t statistic. More generally, however, the lead- 
ing term can be used to show that the exact rate of convergence over the 
whole real line is equivalent to the exact rate of convergence over very small 
sets, containing no more than three points. The number of points can be 
reduced to two if we seek necessary and sufficient characterizations of the 
convergence rate, rather than the exact rate itself. We draw connections to 
the rate of convergence of the distribution of a conventionally normalized, 
non-Studentized mean. 

2. Main results. Let $X_1, X_2, \ldots$ be independent and identically distributed
random variables, and let $X$ have the distribution of a generic $X_i$. Student's
t statistic, with numerator centered at its expectation, is defined to be

(2.1) $T = \Bigl(\sum_{i=1}^n X_i\Bigr) \Big/ \Bigl\{\sum_{i=1}^n X_i^2 - \frac{1}{n}\Bigl(\sum_{i=1}^n X_i\Bigr)^2\Bigr\}^{1/2}$.

An alternative, more classical definition of the Studentized mean, in which
the sample variance has divisor $n - 1$ rather than $n$, has the formula
$(1 - n^{-1})^{1/2} T$; see Gossett (1908). All our results hold for this version of
Student's statistic, as well as that given by (2.1). The principal results are
Theorems 2.1 and 2.2, which respectively describe the leading term and its 
role in a normal approximation to the distribution of T. Propositions 3.1 
and 3.2 in the next section reveal the origins of the leading term, and in 
particular link it to the way in which extremes affect the distribution of T. 
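
As a quick numerical illustration of how the statistic at (2.1) relates to the classical form with divisor $n - 1$, the following Python sketch computes both for an arbitrary sample and compares their ratio with $(1 - n^{-1})^{1/2}$. The function names and the sample values are illustrative assumptions, not part of the paper.

```python
import math

def student_T(xs):
    """T as at (2.1): sum of the data over {sum of squares - (sum)^2 / n}^{1/2}."""
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return s / math.sqrt(ss - s * s / n)

def classical_t(xs):
    """Classical t statistic for a zero hypothesised mean, sample variance with divisor n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return math.sqrt(n) * mean / math.sqrt(var)

sample = [0.3, -1.2, 2.5, 0.7, -0.4, 1.9, -2.1, 0.05]  # arbitrary illustrative data
n = len(sample)
T = student_T(sample)
t = classical_t(sample)
ratio = t / T  # compared below with (1 - 1/n)^{1/2}
expected_ratio = math.sqrt(1.0 - 1.0 / n)
```

Because the two versions differ only by a deterministic factor tending to 1, rates of convergence stated for one carry over to the other.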

Write $\Phi$ and $\phi$ for the standard normal distribution and density functions,
respectively. Put $b_n = \sup\{x : n x^{-2} E[X^2 I(|X| \le x)] \ge 1\}$ and

(2.2) $L_n(x) = nE\{\Phi[x\{1 + (X/b_n)^2\}^{1/2} - (X/b_n)] - \Phi(x)\}$.

Theorem 2.1. If the distribution of $X$ is in the domain of attraction
of the normal law, and $E(X) = 0$, then

(2.3) $\sup_{-\infty<x<\infty} |P(T \le x) - \{\Phi(x) + L_n(x)\}| = o(\delta_n) + O(n^{-1/2})$.

If, in addition, Cramér's condition holds, that is,

$\limsup_{|t|\to\infty} |E(e^{itX})| < 1$,

then $O(n^{-1/2})$ on the right-hand side of (2.3) may be replaced by $O(n^{-1})$.






We noted in Section 1 that $T$ has a limiting standard normal distribution
if and only if the distribution of $X$ is in the domain of attraction of the
normal law and $E(X) = 0$. Theorem 2.1 argues that $L_n(x)$ is a leading term
in an expansion of the distribution of $T$. As Theorem 2.2 will show, the
exact order of magnitude of $L_n(x)$ is that of

(2.4) $\delta_n = nP(|X| > b_n) + n b_n^{-1}|E\{XI(|X| \le b_n)\}| + n b_n^{-3}|E\{X^3 I(|X| \le b_n)\}| + n b_n^{-4}E\{X^4 I(|X| \le b_n)\}$.

Theorem 2.2. Assume the distribution of $X$ is in the domain of attraction
of the normal law and $E(X) = 0$. Then $\delta_n \to 0$ and

(2.5) $\sup_{-\infty<x<\infty} |L_n(x)| \asymp \delta_n$

as $n \to \infty$. Here and below, $a_n \asymp b_n$ denotes that

$0 < \liminf_{n\to\infty} a_n/b_n \le \limsup_{n\to\infty} a_n/b_n < \infty$.

Property (2.5) continues to hold if the supremum over all $x$ is replaced by the
supremum over $x \in \{-x_0, x_0, x_1\}$, where $x_0 > 3^{1/2}$ and $x_1$ is any real number
not equal to $\pm x_0$. Furthermore, if $E(|X|^3) < \infty$, $E(X^2) = 1$ and $E(X^3) = \gamma$,
then

(2.6) $\sup_{-\infty<x<\infty} |n^{1/2} L_n(x) - \tfrac{1}{6}\gamma(2x^2 + 1)\phi(x)| \to 0$

as $n \to \infty$.



There exist examples of distributions in the domain of attraction of the
normal law having zero mean, and for which any given one of the four
components in the definition of $\delta_n$, at (2.4), dominates all the others along
a subsequence. It follows that none of the terms of which $\delta_n$ is composed
can be dropped if we require a full account of the rate of convergence in
the central limit theorem. Formula (2.6) shows that in the case of finite
third moment, the leading term is asymptotic to its conventional form in an
Edgeworth expansion.
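
The limit at (2.6) can be checked numerically. The sketch below evaluates $b_n$ and the leading term $L_n(x)$ of (2.2) for a hypothetical two-point law with zero mean, unit variance and third moment $\gamma = 1.5$, and compares $n^{1/2} L_n(x)$ with $\tfrac{1}{6}\gamma(2x^2+1)\phi(x)$. The distribution, the helper names and the choice $n = 10^6$ are assumptions made only for illustration.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def b_n(atoms, n):
    """sup{x : n x^-2 E[X^2 I(|X| <= x)] >= 1} for a finite discrete law.

    atoms is a list of (value, probability) pairs.
    """
    levels = sorted({abs(v) for v, _ in atoms if v != 0})
    best, m = 0.0, 0.0
    for k, a in enumerate(levels):
        # m is the truncated second moment E[X^2 I(|X| <= x)] for a <= x < next level
        m += sum(p * v * v for v, p in atoms if abs(v) == a)
        upper = levels[k + 1] if k + 1 < len(levels) else math.inf
        root = math.sqrt(n * m)  # on this interval the condition reads x <= root
        if root >= a:
            best = max(best, min(root, upper))
    return best

def L_n(x, atoms, n):
    """Leading term (2.2) for a finite discrete law."""
    b = b_n(atoms, n)
    return n * sum(
        p * (Phi(x * math.sqrt(1.0 + (v / b) ** 2) - v / b) - Phi(x))
        for v, p in atoms
    )

# Hypothetical law: P(X = 2) = 0.2, P(X = -0.5) = 0.8 (mean 0, variance 1).
atoms = [(2.0, 0.2), (-0.5, 0.8)]
gamma = sum(p * v ** 3 for v, p in atoms)
n = 10 ** 6
x = 1.0
lhs = math.sqrt(n) * L_n(x, atoms, n)
rhs = gamma * (2.0 * x * x + 1.0) * phi(x) / 6.0
```

For this bounded law $b_n = n^{1/2}$ once $n$ is moderately large, and the gap between `lhs` and `rhs` is of order $n^{-1/2}$, driven by the fourth-moment term.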

Together, properties (2.3) and (2.5) give concise results about the rate of
convergence in the central limit theorem. For example, if the distribution of
$X$ is in the domain of attraction of the normal law, and $E(X) = 0$, then (2.3)
and (2.5) imply that

(2.7) $\sup_{-\infty<x<\infty} |P(T \le x) - \Phi(x)| + n^{-1/2} \asymp \delta_n + n^{-1/2}$;

and $n^{-1/2}$ may be replaced by $n^{-1}$ if Cramér's condition is satisfied. One
application to which (2.7) can be put is the derivation of characterizations
of rates of convergence in the central limit theorem. In this regard, some
examples can be found in Hall and Wang (2003), on which the present paper
is based.

We conclude this section by mentioning that the convergence rate $\delta_n$ is the
same as that in the case of the standard (i.e., non-Studentized) central limit
theorem, where a sum of independent and identically distributed random
variables is standardized for scale using $b_n$, but is centered conventionally,
not using a truncated mean. That is, if we define $S_1 = b_n^{-1}\sum_{i\le n} X_i$,
$F_j(x) = P(S_j \le x)$ and

(2.8) $L_{n1}(x) = nE\{\Phi(x - X/b_n) - \Phi(x)\} - \tfrac{1}{2} n b_n^{-2} \phi'(x)$,

then, provided the distribution of $X$ is in the domain of attraction of the
normal law and $E(X) = 0$, it is true that $\sup_{-\infty<x<\infty} |L_{n1}(x)| \asymp \delta_n$ and

(2.9) $\sup_{-\infty<x<\infty} |F_1(x) - \{\Phi(x) + L_{n1}(x)\}| = o(\delta_n) + O(n^{-1/2})$.

The methods of proof are similar to those given in Chapter 2 of Hall (1982).
Alternatively, if we put $\sigma_n^2 = E\{X^2 I(|X| \le b_n)\}$ and $S_2 = (\sum_{i\le n} X_i)/(n^{1/2}\sigma_n)$,
and define $L_{n2}(x)$ as at (2.8) but with $b_n$ there replaced by $n^{1/2}\sigma_n$, then
(2.9) continues to hold if we replace $(F_1, L_{n1})$ by $(F_2, L_{n2})$.

The similarities between the Studentized and non-Studentized cases do 
not penetrate deeply, however. The leading terms in the respective settings 
are quite different. In the case of finite third moment, the leading terms are 
asymptotic to their respective Edgeworth forms, which are well known to 
have intrinsically different formulae. 



3. Proofs. 



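3.1. Proof of Theorem 2.1. Let $a > 0$ and define $Y_i = X_i I(|X_i| \le ab_n)$,
$p_n = nP(|X| > ab_n)$,

(3.1) $\Psi_n(x) = P\Bigl[\sum_{i\le n} Y_i \Big/ \Bigl\{\sum_{i\le n} Y_i^2 - n^{-1}\Bigl(\sum_{i\le n} Y_i\Bigr)^2\Bigr\}^{1/2} \le x\Bigr]$,

$M_{n1}(x) = nE\{(\Phi[x\{1 + (X/b_n)^2\}^{1/2} - (X/b_n)] - \Phi(x)) I(|X| > ab_n)\}$,
$M_{n2}(x) = nE\{(\Phi[x\{1 + (X/b_n)^2\}^{1/2} - (X/b_n)] - \Phi(x)) I(|X| \le ab_n)\}$.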



Theorem 2.1 is a direct consequence of the following two propositions, which 
will be proved in Sections 3.3 and 3.4. 



Proposition 3.1. Assume the distribution of $X$ is in the domain of
attraction of the normal law, and $E(X) = 0$. Then, for each $a > 0$,

(3.2) $\sup_{-\infty<x<\infty} |P(T \le x) - \{\Psi_n(x) + M_{n1}(x)\}| = o(p_n)$.

Proposition 3.2. Assume the distribution of $X$ is in the domain of
attraction of the normal law, and $E(X) = 0$. Then, for each $\varepsilon > 0$ we have,
for all sufficiently small $a > 0$,

(3.3) $\sup_{-\infty<x<\infty} |\Psi_n(x) - \{\Phi(x) + M_{n2}(x)\}| \le \varepsilon\delta_n + O(n^{-1/2})$.

If, in addition, the distribution of $X$ satisfies Cramér's continuity condition,
then the term $O(n^{-1/2})$ on the right-hand side of (3.3) may be replaced by
$O(n^{-1})$.

We remark that our method for proving Proposition 3.1 will show clearly
that the leading-term fragment $M_{n1}$ derives principally from the largest
summand among $X_1, \ldots, X_n$, that is, from the value $X_{\max}$ of $X_i$ for which
$|X_i|$ is greatest. Indeed, it may be proved that

$M_{n1}(x) = E\{(\Phi[x\{1 + (X_{\max}/b_n)^2\}^{1/2} - (X_{\max}/b_n)] - \Phi(x)) I(|X_{\max}| > ab_n)\} + O(p_n)$,

uniformly in $x$. It follows that the leading term $L_n(x)$, introduced at (2.2)
and defined as the limit of $M_{n1}$ as $a \to 0$, also has this origin.

The connections to extremes arise in part through the major role that 
large summands play in convergence properties of series when the distribu- 
tion of the summands has infinite variance. See Darling (1952), Arov and 
Bobrov (1960), Dwass (1966), Hall (1978), LePage, Woodroofe and Zinn 
(1981) and Resnick (1986) for discussion of more conventional settings. In 
the present case the main series where extremes cause difficulty is $\sum_{i\le n} X_i^2$,
appearing in the definition of $T$ at (2.1). The summands here have finite
variance if and only if the sampled distribution has finite fourth moment.
However, extremes arising even from the series $\sum_{i\le n} X_i$ play a role in the
leading term and so too in the convergence rate; see Hall (1984) for discus- 
sion of the latter issue. 

3.2. Proof of Theorem 2.2. It is straightforward to show that $\delta_n \to 0$ and
$\sup_{-\infty<x<\infty} |L_n(x)| = O(\delta_n)$. Therefore, it suffices to prove that

(3.4) $\delta_n = O\Bigl(\sup_{x\in S} |L_n(x)|\Bigr)$,

where $S = \{-x_0, x_0, x_1\}$ is the set of three points in the statement of the
theorem; and that (2.6) holds. This follows relatively straightforwardly.

3.3. Proof of Proposition 3.1. Put $V = \max_{i\le n} |X_i|$ and $J = \arg\max_{i\le n} |X_i|$;
ties may be broken in any measurable way. Define $S$ to be the sign of $X_J$
and let $T_1 = \sum_{i\le n} X_i$, $T_2 = \sum_{i\le n} X_i^2$, $T_3 = \sum_{i\le n} Y_i + SVI(V > ab_n)$ and
$T_4 = \sum_{i\le n} Y_i^2 + V^2 I(V > ab_n)$. The probability that two or more values
of $|X_i|$, for $1 \le i \le n$, exceed $ab_n$ equals $O(p_n^2)$. Therefore,
$P\{(T_1, T_2) = (T_3, T_4)\} = 1 - O(p_n^2)$, whence it follows that, uniformly in $x$,

(3.5) $P(T \le x) = P\{T_3/(T_4 - n^{-1}T_3^2)^{1/2} \le x\} + O(p_n^2)$.



Put $\pi(v) = P(X > 0 \mid |X| = v)$. Conditional on $X_1, \ldots, X_n$, let $S(V)$ denote
a random variable that takes the values $+1$ and $-1$ with probabilities $\pi(V)$
and $1 - \pi(V)$, respectively. Let $T_5 = \sum_{i\le n} Y_i + S(V)VI(V > ab_n)$. Then
$(T_5, T_4)$ has the same joint distribution as $(T_3, T_4)$, and so by (3.5) we have,
uniformly in $x$,

(3.6) $P(T \le x) = P(W \le x) + O(p_n^2)$,

where $W = T_5/(T_4 - n^{-1}T_5^2)^{1/2}$.
Define

$T_Y = \sum_{i\le n} (Y_i - EY_i)$, $T_B = \sum_{i\le n} (Y_i^2 - EY_i^2)$,

$\nu = E\{XI(|X| \le ab_n)\}$, $\tau^2 = E\{X^2 I(|X| \le ab_n)\}$.
Note that a formula for $\Psi_n(x)$, equivalent to (3.1), is

(3.7) $\Psi_n(x) = P[(T_Y + n\nu)/\{T_B + n\tau^2 - n^{-1}(T_Y + n\nu)^2\}^{1/2} \le x]$.

Let the random variable $N_1$ have the standard normal distribution. The
joint distribution of the vector $(b_n^{-1}T_Y, b_n^{-2}T_B)$, conditional on $V > ab_n$,
converges to the joint distribution of $(N_1, 0)$. In particular, the second
component of the limiting distribution is degenerate at 0. The convergence has the
following property: for all $\varepsilon > 0$,

(3.8) $\sup_{v > ab_n} \sup_{-\infty<x<\infty} |P(b_n^{-1}T_Y \le x;\ b_n^{-2}|T_B| \le \varepsilon \mid V = v) - P(N_1 \le x)| \to 0$.

For a formal proof of (3.8), it suffices to observe that the joint distribution
of $(\sum_{i\le n} Y_i, \sum_{i\le n} Y_i^2)$, conditional on $V = v > ab_n$, equals the unconditional
joint distribution of $(\sum_{i\le n-1} Y_i, \sum_{i\le n-1} Y_i^2)$; and that $b_n^{-1}\sum_{i\le n-1} (Y_i - EY_i) \to N_1$
in distribution, $b_n^{-2}\sum_{i\le n-1} (Y_i^2 - EY_i^2) \to 0$ in probability and
$b_n^{-1}|E(Y_1)| + b_n^{-2}E(Y_1^2) \to 0$.

Since $T_4 = T_B + n\tau^2 + V^2 I(V > ab_n)$, $T_5 = T_Y + n\nu + S(V)VI(V > ab_n)$,
$b_n^{-2}n\tau^2 \to 1$ and $b_n^{-1}n\nu \to 0$, then, for all $\varepsilon > 0$, we have from (3.8),

$\sup_{v > ab_n} \sup_{-\infty<x<\infty} |P[b_n^{-1}T_5 - S(V)(V/b_n) \le x;\ |b_n^{-2}T_4 - 1 - (V/b_n)^2| \le \varepsilon \mid V = v] - P(N_1 \le x)| \to 0$.
Therefore, if $N_2$ denotes a standard normal random variable that is
independent of $V$, then

$\sup_{v > ab_n} \sup_{-\infty<x<\infty} \Bigl| P(W \le x \mid V = v) - P\Bigl[\frac{N_2 + S(V)(V/b_n)}{\{1 + (V/b_n)^2\}^{1/2}} \le x \Bigm| V = v\Bigr] \Bigr| \to 0$.



Equivalently,

$P(W \le x \mid V = v) - E\,\Phi[x\{1 + (v/b_n)^2\}^{1/2} - S(v)(v/b_n)] \to 0$,

uniformly in $v > ab_n$ and $-\infty < x < \infty$. Multiply throughout by $dF_n(v)$,
where $F_n$ denotes the distribution function of $V$ conditional on $V > ab_n$;
integrate over $v > ab_n$; and then multiply by $P(V > ab_n)$, to prove that,
uniformly in $x$,

(3.9) $P(W \le x;\ V > ab_n) = E(\Phi[x\{1 + (V/b_n)^2\}^{1/2} - S(V)(V/b_n)] I(V > ab_n)) + o(p_n)$
$= nE(\Phi[x\{1 + (X/b_n)^2\}^{1/2} - (X/b_n)] I(|X| > ab_n)) + o(p_n)$.

To derive the last identity, reformulate the expectation using an integration
by parts argument, and note that

$P\{S(V)V > y\} = nP(X > y) + O[\{nP(X > y)\}^2]$,
$P\{S(V)V \le y\} = nP(X \le y) + O[\{nP(X \le y)\}^2]$,

where both remainders are of the stated orders uniformly in $y > ab_n$ and
$y < -ab_n$, respectively.
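
Remainders of the form $O[\{nP(\cdot)\}^2]$, like the $O(p_n^2)$ term in (3.5), rest on an elementary fact: if each of $n$ independent events has probability $p$, the probability that at least one occurs is $np$ up to an error of at most $(np)^2$. The Python sketch below checks this numerically; the values of $n$ and $p$ and the function name are illustrative assumptions.

```python
def prob_at_least_one(p, n):
    """Probability that at least one of n independent events of probability p occurs."""
    return 1.0 - (1.0 - p) ** n

n = 1000
checks = []
for p in (1e-6, 1e-5, 1e-4, 1e-3):
    exact = prob_at_least_one(p, n)
    first_order = n * p                  # the leading term n * p
    error = abs(exact - first_order)     # size of the remainder
    bound = (n * p) ** 2                 # claimed order of the remainder
    checks.append((p, error, bound))

all_within_bound = all(error <= bound for _, error, bound in checks)
```

The same comparison explains why events involving two or more large summands contribute only at the order $p_n^2$.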
Furthermore,

$E\{P(W \le x \mid V \le ab_n) I(V \le ab_n)\}$
$= E\Bigl[I\Bigl(\frac{T_Y + n\nu}{\{T_B + n\tau^2 - n^{-1}(T_Y + n\nu)^2\}^{1/2}} \le x\Bigr) I(V \le ab_n)\Bigr]$
$= \Psi_n(x) - E\Bigl[I\Bigl(\frac{T_Y + n\nu}{\{T_B + n\tau^2 - n^{-1}(T_Y + n\nu)^2\}^{1/2}} \le x\Bigr) I(V > ab_n)\Bigr]$,

using (3.7) to obtain the last identity. A simpler version of the argument
leading to (3.9) may be used to prove that the subtracted term above equals
$\Phi(x)p_n + o(p_n)$, uniformly in $x$. Therefore,

(3.10) $P(W \le x;\ V \le ab_n) = \Psi_n(x) - \Phi(x)p_n + o(p_n)$,

uniformly in $x$. Combining (3.10) with (3.6) and (3.9), we conclude that
(3.2) holds.
3.4. Proof of Proposition 3.2. Define $S_n = \sum_j X_j$, $S_n^* = \sum_j Y_j$, $V_n^2 = \sum_j X_j^2$
and $V_n^{*2} = \sum_j Y_j^2$. It is well known [e.g., Efron (1969)] that, for $x \ge 0$,

(3.11) $\Psi_n(x) = P[S_n^*/V_n^* \le x\{n/(n + x^2)\}^{1/2}]$.
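
The identity behind (3.11) is purely algebraic: for any sample, the Studentized ratio is at most $x$ exactly when the self-normalized ratio is at most $x\{n/(n+x^2)\}^{1/2}$. The following Python sketch checks this on an arbitrary sample; the data, the grid of $x$ values and the function names are illustrative assumptions.

```python
import math

def studentized(xs):
    """Sum over {sum of squares - (sum)^2 / n}^{1/2}, as at (2.1)."""
    n = len(xs)
    s = sum(xs)
    v2 = sum(x * x for x in xs)
    return s / math.sqrt(v2 - s * s / n)

def self_normalized(xs):
    """Sum over the square root of the sum of squares."""
    s = sum(xs)
    v2 = sum(x * x for x in xs)
    return s / math.sqrt(v2)

def threshold(x, n):
    """The transformed level x {n / (n + x^2)}^{1/2} appearing in (3.11)."""
    return x * math.sqrt(n / (n + x * x))

sample = [0.3, -1.2, 2.5, 0.7, -0.4, 1.9, -2.1, 0.05]  # arbitrary illustrative data
n = len(sample)
t_val = studentized(sample)
r_val = self_normalized(sample)

# The two events coincide for every level x >= 0 on the grid.
events_agree = all(
    (t_val <= x) == (r_val <= threshold(x, n))
    for x in [0.0, 0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
)
# Applying the transformation to the Studentized value recovers the self-normalized one.
recovered = threshold(t_val, n)
```

This is why the proof can work with the self-normalized sum $S_n^*/V_n^*$ in place of the Studentized form.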
Noting also that for $|u| \le 1$,

$\sup_{-\infty<x<\infty} |\Phi\{x(1 + u^2)^{1/2} - u\} - [\Phi(x) + \{-u + \tfrac{1}{6}u^3(2x^2 + 1) + \tfrac{1}{12}u^4 x(x^2 - 3)\}\phi(x)]| \le C|u|^5$,

where $C$ is an absolute constant, and that $nE\{|X/b_n|^5 I(|X| \le ab_n)\} \le a\delta_n$,
we have that, for any $a > 0$,

(3.12) $\sup_{-\infty<x<\infty} |M_{n2}(x) - Q_{n1}(x)| \le Ca\delta_n$,

where $u_{nj} = nE\{(X/B_n)^j I(|X| \le ab_n)\}$ and

$Q_{n1}(x) = -u_{n1}\phi(x) + u_{n3}\tfrac{1}{6}(2x^2 + 1)\phi(x) + u_{n4}\tfrac{1}{12}x(x^2 - 3)\phi(x)$.
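
The Taylor approximation used to pass from $M_{n2}$ to $Q_{n1}$ can be checked numerically. The sketch below compares $\Phi\{x(1+u^2)^{1/2} - u\}$ with the expansion $\Phi(x) + \{-u + \tfrac16 u^3(2x^2+1) + \tfrac1{12}u^4 x(x^2-3)\}\phi(x)$ on a grid of $x$ values and reports the error divided by $|u|^5$. The grid, the chosen values of $u$ and the helper names are illustrative assumptions.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def exact(x, u):
    return Phi(x * math.sqrt(1.0 + u * u) - u)

def expansion(x, u):
    poly = -u + u ** 3 * (2.0 * x * x + 1.0) / 6.0 + u ** 4 * x * (x * x - 3.0) / 12.0
    return Phi(x) + poly * phi(x)

grid = [k / 4.0 for k in range(-16, 17)]  # x from -4 to 4 in steps of 0.25

def max_error(u):
    """Largest absolute approximation error over the grid of x values."""
    return max(abs(exact(x, u) - expansion(x, u)) for x in grid)

# Error relative to |u|^5; it should stay bounded as u shrinks.
ratios = {u: max_error(u) / abs(u) ** 5 for u in (0.1, 0.05, -0.05, -0.1)}
```

Note that the expansion has no $u^2$ term: the contributions of order $u^2$ from the square root and from the curvature of $\Phi$ cancel.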

In view of (3.11) and (3.12), Proposition 3.2 will follow if, for each $\varepsilon > 0$, we
have for all sufficiently small $a > 0$,

(3.13) $\sup_{-\infty<x<\infty} |P(S_n^*/V_n^* \le x) - \{\Phi(x) + Q_{n1}(x)\}| \le \varepsilon\delta_n + O(n^{-1/2})$,

and $O(n^{-1/2})$ may be replaced by $O(n^{-1})$ if Cramér's condition is satisfied.

Without loss of generality, $x \ge 0$. Since the distribution of $X$ is in the
domain of attraction of the normal law, $\{S_n/V_n\}$ is stochastically bounded
[see, e.g., Giné, Götze and Mason (1997)] and similarly $\{S_n^*/V_n^*\}$ is also
stochastically bounded. Hence, by Theorem 2.5 of Giné, Götze and Mason
(1997), for $x \ge \delta_n^{-1/12}$,

$P(S_n^* \ge xV_n^*) \le e^{-x} \sup_n E\{\exp(|S_n^*/V_n^*|)\} \le A\exp(-\delta_n^{-1/12}) \le A\delta_n^2$.

(Here and below, $A$ denotes a positive constant which might be different
at each appearance.) Moreover, $|1 - \Phi(x) - Q_{n1}(x)| \le A\delta_n^2$, uniformly in
$x \ge \delta_n^{-1/12}$. Therefore, (3.13) will follow if, for each $\varepsilon > 0$, we have for all
sufficiently small $a > 0$,

(3.14) $\sup{}' |P(S_n^*/V_n^* \le x) - \{\Phi(x) + Q_{n1}(x)\}| \le \varepsilon\delta_n + O(n^{-1/2})$,

and $O(n^{-1/2})$ may be replaced by $O(n^{-1})$ if Cramér's condition is satisfied,
where $\sup'$ denotes the supremum over $x \in [0, \delta_n^{-1/12}]$.

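Let $B_n^2 = nEY_1^2$ and $W_n = B_n^{-2}\sum_j (Y_j^2 - EY_j^2)$. Noting that
$(1 + y)^{1/2} = 1 + \tfrac{1}{2}y - \tfrac{1}{8}y^2 + \tfrac{1}{16}y^3 + \theta y^4$, where $\theta = \theta(y)$ satisfies
$|\theta| \le \tfrac{1}{16}$ for $|y| \le \tfrac{1}{2}$, we may prove that

$P(S_n^*/V_n^* \le x) = P\{S_n^* \le xB_n(1 + W_n)^{1/2}\}$
$\ge -P(|W_n| > \tfrac{1}{2}) + P\{S_n^* \le xB_n(1 + \tfrac{1}{2}W_n - \tfrac{1}{8}W_n^2 + \tfrac{1}{16}W_n^3 - \tfrac{1}{16}W_n^4)\}$,

$P(S_n^*/V_n^* \le x) = P\{S_n^* \le xB_n(1 + W_n)^{1/2}\}$
$\le P(|W_n| > \tfrac{1}{2}) + P\{S_n^* \le xB_n(1 + \tfrac{1}{2}W_n - \tfrac{1}{8}W_n^2 + \tfrac{1}{16}W_n^3 + \tfrac{1}{16}W_n^4)\}$.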
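
The two-sided bound above relies on the remainder coefficient $\theta(y)$ in the cubic expansion of $(1+y)^{1/2}$ staying within $\pm\tfrac{1}{16}$ when $|y| \le \tfrac12$. A small Python check of that claim on a grid follows; the grid spacing and function names are illustrative assumptions.

```python
import math

def cubic_part(y):
    """Third-order Taylor polynomial of (1 + y)^{1/2} about y = 0."""
    return 1.0 + y / 2.0 - y * y / 8.0 + y ** 3 / 16.0

def theta(y):
    """Remainder coefficient: (1 + y)^{1/2} = cubic_part(y) + theta(y) * y^4."""
    return (math.sqrt(1.0 + y) - cubic_part(y)) / y ** 4

# Grid over [-1/2, 1/2], skipping y = 0 where theta is defined only as a limit.
grid = [k / 1000.0 for k in range(-500, 501) if k != 0]
worst = max(abs(theta(y)) for y in grid)
theta_at_lower_end = theta(-0.5)
```

On this grid the largest value of $|\theta(y)|$ occurs at $y = -\tfrac12$, which is where the bound $\tfrac{1}{16}$ is closest to being attained.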

In view of Markov's inequality, it is readily seen that, for each $\varepsilon > 0$, we have
for any sufficiently small $a > 0$ and all sufficiently large $n$,

$P(|W_n| > 1/2) \le 16E(W_n^4) \le A(a^4\delta_n + \delta_n^2) \le \varepsilon\delta_n$.

Hence, (3.14) will follow if we prove that, for each $\varepsilon > 0$, we have for
$|\theta| \le 1/16$ and any sufficiently small $a > 0$,

(3.15) $\sup{}' |P\{S_n^* \le xB_n(1 + \tfrac{1}{2}W_n - \tfrac{1}{8}W_n^2 + \tfrac{1}{16}W_n^3 + \theta W_n^4)\} - \{\Phi(x) + Q_{n1}(x)\}| \le \varepsilon\delta_n + O(n^{-1/2})$,

and $O(n^{-1/2})$ may be replaced by $O(n^{-1})$ if Cramér's condition is satisfied.

Let $\sum_{j\ne k}$, $\sum_{j\ne k\ne l}$ and $\sum_{j\ne k\ne l\ne m}$ denote summations over pairs, triples
and quadruples, respectively, of distinct integers between 1 and $n$. Put
$Z_j = Y_j^2 - EY_j^2$. Simple calculations show that

$B_n^6 W_n^3 = \sum_{j=1}^n Z_j^3 + 3\sum_{j\ne k} Z_j(Z_k^2 - EZ_k^2) + \sum_{j\ne k\ne l} Z_jZ_kZ_l + W_{n1}$,

$B_n^8 W_n^4 = \sum_{j=1}^n Z_j^4 + 4\sum_{j\ne k} Z_j(Z_k^3 - EZ_k^3) + 12\sum_{j\ne k} (Z_j^2 - EZ_j^2)(Z_k^2 - EZ_k^2) + W_{n2}$,

where $W_{n1} = 3(n - 1)E(Z_1^2)\sum_{j=1}^n Z_j$ and

$W_{n2} = 4(n - 1)E(Z_1^3)\sum_{j=1}^n Z_j + 24(n - 1)E(Z_1^2)\sum_{j=1}^n (Z_j^2 - EZ_j^2) + 12n(n - 1)(EZ_1^2)^2 + 24\sum_{j\ne k\ne l} Z_jZ_kZ_l^2 + \sum_{j\ne k\ne l\ne m} Z_jZ_kZ_lZ_m$.
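
The cubic identity for $B_n^6 W_n^3$ is a rearrangement of $(\sum_j Z_j)^3$ in which the pair sum is recentred and the correction is collected into $W_{n1}$. The Python sketch below verifies the rearrangement by brute force, reading $\sum_{j\ne k}$ and $\sum_{j\ne k\ne l}$ as sums over ordered tuples of distinct indices. The numbers in `z` and the constant `m2`, which stands in for $E(Z_1^2)$, are arbitrary illustrative assumptions.

```python
from itertools import permutations

z = [0.5, -1.25, 2.0, 0.75, -1.0]  # arbitrary values playing the role of Z_1, ..., Z_n
n = len(z)
m2 = 1.7                           # arbitrary constant standing in for E(Z_1^2)

total_cubed = sum(z) ** 3

# Diagonal terms: sum of Z_j^3.
diag = sum(v ** 3 for v in z)
# Recentred pair terms over ordered pairs of distinct indices.
pairs = 3 * sum(z[j] * (z[k] ** 2 - m2) for j, k in permutations(range(n), 2))
# Triple terms over ordered triples of distinct indices.
triples = sum(z[j] * z[k] * z[l] for j, k, l in permutations(range(n), 3))
# Correction collected into W_{n1}.
w_n1 = 3 * (n - 1) * m2 * sum(z)

rhs = diag + pairs + triples + w_n1
```

The check holds for any value of `m2`, since recentring the pair sum and adding back `w_n1` cancel exactly.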

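CONVERGENCE RATE FOR STUDENT'S T STATISTIC

Therefore,

$P\{S_n^* \le xB_n(1 + \tfrac{1}{2}W_n - \tfrac{1}{8}W_n^2 + \tfrac{1}{16}W_n^3 + \theta W_n^4)\}$
$= P\Bigl\{\frac{1}{B_n}\sum_{j=1}^n \xi_j(x) + \frac{x}{B_n^4}\sum_{j\ne k} \psi_{jk} + \frac{x}{B_n^6}\sum_{j\ne k\ne l} \psi_{jkl} \le x(1 + W_{n3}) - \frac{nE\eta_1(x)}{B_n}\Bigr\}$,

where $\xi_j(x) = \eta_j(x) - E\eta_j(x)$, $\psi_{jkl} = -\tfrac{1}{16}Z_jZ_kZ_l$, $W_{n3} = \tfrac{1}{16}B_n^{-6}W_{n1} + \theta B_n^{-8}W_{n2}$,

$\eta_j(x) = Y_j - \frac{x}{2B_n}Z_j + \frac{x}{8B_n^3}Z_j^2 - \frac{x}{16B_n^5}Z_j^3 - \frac{\theta x}{B_n^7}Z_j^4$,

$\psi_{jk} = \frac{1}{8}Z_jZ_k - \frac{3}{16B_n^2}Z_j(Z_k^2 - EZ_k^2) - \frac{4\theta}{B_n^4}Z_j(Z_k^3 - EZ_k^3) - \frac{12\theta}{B_n^4}(Z_j^2 - EZ_j^2)(Z_k^2 - EZ_k^2)$.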

It is readily seen that

$E(B_n^{-6}W_{n1})^4 \le A\delta_n^4(a^4\delta_n + \delta_n^2)$ and $E(B_n^{-8}W_{n2})^2 \le A\delta_n^2(a^2\delta_n + \delta_n^2)$.

Hence, for each $\varepsilon > 0$, we have for any sufficiently small $a > 0$ and all
sufficiently large $n$,

$P(|W_{n3}| \ge 2\varepsilon\delta_n) \le P(|B_n^{-6}W_{n1}| \ge \varepsilon\delta_n) + P(|B_n^{-8}W_{n2}| \ge \varepsilon\delta_n)$
$\le A\{\varepsilon^{-4}(a^4\delta_n + \delta_n^2) + \varepsilon^{-2}(a^2\delta_n + \delta_n^2)\} \le \varepsilon\delta_n$.

Result (3.15) now follows easily from the following three propositions. We
will only prove Propositions 3.3 and 3.4 in subsequent sections. The proof of
Proposition 3.5 is relatively straightforward although requiring tedious
algebra, and hence details are omitted. The proof of Proposition 3.2 is therefore
complete.

Proposition 3.3. For all0<a<~, 



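(3.16) $\sup{}' \sup_{-\infty<y<\infty} \Bigl| P\Bigl\{\frac{1}{B_n}\sum_{j=1}^n \xi_j(x) + \frac{x}{B_n^4}\sum_{j\ne k} \psi_{jk} + \frac{x}{B_n^6}\sum_{j\ne k\ne l} \psi_{jkl} \le y\Bigr\} - \{\Phi(y) + \mathcal{L}_n(y)\} \Bigr| = o(\delta_n) + O(n^{-1/2})$,

where $\mathcal{L}_n(y) = n[E\Phi\{y - \xi_1(x)/B_n\} - \Phi(y)] - \tfrac{1}{2}\Phi^{(2)}(y)$.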

Proposition 3.4. If $\limsup_{|t|\to\infty} |Ee^{itX}| < 1$, then the term $O(n^{-1/2})$
on the right-hand side of (3.16) may be replaced by $O(n^{-1})$.

Proposition 3.5. For each $\varepsilon > 0$, we have for any sufficiently small
$a > 0$,

(3.17) $\sup{}' |\Phi[x - \{nE\eta_1(x)/B_n\}] - \Phi(x) + Q_{n2}(x)| \le \varepsilon\delta_n + O(n^{-1})$,

(3.18) $\sup{}' |\mathcal{L}_n[x - \{nE\eta_1(x)/B_n\}] - Q_{n3}(x)| \le \varepsilon\delta_n + O(n^{-1})$,

where $u_{nj} = nE\{(X/B_n)^j I(|X| \le ab_n)\}$, $Q_{n2}(x) = u_{n1}\phi(x) + \tfrac{1}{8}xu_{n4}\phi(x)$ and

$Q_{n3}(x) = u_{n3}\tfrac{1}{6}(2x^2 + 1)\phi(x) + u_{n4}\tfrac{1}{24}x(2x^2 - 3)\phi(x)$.

3.5. Proof of Proposition 3.3. Standard methods based on Taylor's
expansion, although requiring tedious algebra, may be used to establish the
following lemmas. Define

$u_{nj} = nE\{(X/B_n)^j I(|X| \le ab_n)\}$, $g(t, x) = E[\exp\{it\xi_1(x)/B_n\}]$

and

$f_n(t, x) = e^{-t^2/2}[1 + n\{g(t, x) - 1\} + \tfrac{1}{2}t^2]$.

Lemma 3.6. If < a < \, then for all sufficiently large n, 

(3.19) $|nB_n^{-2}E\xi_1^2(x) - (1 - xu_{n3} + \tfrac{1}{4}x^2u_{n4})| \le 2(1 + x^2)(a\delta_n + n^{-1})$,

(3.20) $|nB_n^{-3}E\xi_1^3(x) - (u_{n3} - \tfrac{3}{2}xu_{n4})| \le 12(1 + |x|^3)(a\delta_n + n^{-1})$,

(3.21) $|nB_n^{-4}E\xi_1^4(x) - u_{n4}| \le 32(1 + x^4)(a\delta_n + n^{-1})$,

(3.22) $nB_n^{-5}E|\xi_1(x)|^5 \le 32(1 + |x|^5)a\delta_n$.

Lemma 3.7. There exists a constant cq > such that, for all a 6 (0, 
\t\ < Con 1 / 2 , all x G [O,^ 1 ^ 12 ] and all sufficiently large n, 

(3.23) $|g(t, x)| \le e^{-t^2/(8n)}$,

(3.24) $|g^n(t, x) - e^{-t^2/2}| \le A(1 + x^4)(1 + a^{-1})\delta_n(t^2 + t^4)e^{-t^2/8}$,

(3.25) $|g^n(t, x) - f_n(t, x)| \le \{A(1 + x^8)(1 + a^{-2})\delta_n^2(t^4 + t^8) + 2n^{-1}t^4\}e^{-t^2/8}$.

Throughout the proof of Proposition 3.3, we assume that < a < tj, 
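$0 \le x \le \delta_n^{-1/12}$ and $n$ is sufficiently large. Define $\varphi_{jk} = \psi_{jk} + \psi_{kj}$,
$T_n = B_n^{-1}\sum_{j} \xi_j(x)$ and

$\Delta_{n,m} = \frac{x}{B_n^4}\sum_{j=1}^{m-1}\sum_{k=j+1}^{n} \varphi_{jk} + \frac{6x}{B_n^6}\sum_{j=1}^{m-2}\sum_{k=j+1}^{n}\sum_{l=k+1}^{n} \psi_{jkl}$.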

Noting that B 2 = nEY 2 = b\ for sufficiently large n, we obtain that \Yj\ < 
$B_n/2$ and $|Z_j| \le B_n^2/2$. Using these properties, $E(\psi_{123} \mid X_j) = 0$, $j = 1, 2, 3$,
and $E(\varphi_{12} \mid X_j) = 0$, $j = 1, 2$, we may deduce that $E(\varphi_{12}^2) \le 2^8(EY_1^4)^2$,
$E(\psi_{123}^2) \le (EY_1^4)^3$ and, for $1 \le m \le n$,

(3.26) $E(\Delta_{n,m}^2) \le 2x^2\{mnB_n^{-8}E(\varphi_{12}^2) + mn^2B_n^{-12}E(\psi_{123}^2)\} \le 2^{10}mn^{-1}x^2\delta_n^2$,

the last inequality following from the fact that $nB_n^{-4}EY_1^4 \le \delta_n$. Moreover,
noting that $n^{-1} \le \delta_n \to 0$, $|u_{n4}| \le \delta_n$ and $|u_{n3}| \le (1 + a^{-1})\delta_n$, it follows easily
from (3.19)-(3.21) that



(3.27) W-l 

(3.28) $nB_n^{-3}|E\xi_1^3(x)| \le 32(1 + |x|^3)(1 + a^{-1})\delta_n$,

(3.29) $nB_n^{-4}E\xi_1^4(x) \le 65(1 + x^4)\delta_n$.

We now turn back to the proof of (3.16). Using the identities

$\Delta_{n,n} = \frac{x}{B_n^4}\sum_{j\ne k} \psi_{jk} + \frac{x}{B_n^6}\sum_{j\ne k\ne l} \psi_{jkl}$

and $\int e^{ity}\, d\{\Phi(y) + \mathcal{L}_n(y)\} = f_n(t, x)$, and Esseen's smoothing lemma [e.g.,
Petrov (1975), page 109], it may be shown that

sup \P(T n + A n , n <y)-{$(y)+£ n (y)}\ 

— oo<j/<oo 

< / , \Eexp{it{T n + A n>n )} - U^x)^- 1 dt 

J\t\<rain{6~ 2 fion 1 / 2 } 

(3.30) +A{6 2 +n -l/2 ) gup \ {d/dy){Hy)+Cn (y)}\ 

— oo<j/<oo 

< J2 hn + A(% + n- l ' 2 ){l + a" 1 ^), 

where Co is as in Lemma 3.7, 

hn= \E exp{it(T n + A nn )} 

J\t\<Sn ' 

— E exp(itT n ) — itE{A n ^ n ex.-p(itT n )}\\t\~ 1 dt, 
2\Eex V {itT n ) - f n (t,x)\\t\~ l dt 

\t\<5- 1/4 

+ / 1/4 {EexpiitT^Wtl^dt, 
hn= , M |-E{A nn exp(ziT n )}| dt, 

-/|<l<<5,; 1/4 

hn= [ , M , iSexp^^ + A^)}]^- 1 ^, 



<2(l + x 2 )(l + a- 1 )<5 n <i 



<2n 
and we have used the property, implied by (3.27)-(3.29), that



sup \C n (y)\<A 

-oo<j/<oo 



nE£l(x) 



Bl 



+ n\Ej\(x)\ + nEi\{x) 



2ABt 



$\le A(1 + x^4)(1 + a^{-1})\delta_n \le A(1 + a^{-1})\delta_n^{2/3}$.
Using (3.26) and the fact that $|e^{iu} - 1 - iu| \le u^2/2$, it can be shown that



(3.31) 



hn < 2 



|t|<<5„ 



1/4 



E(AlJ\t\dt<2W n /2 <2% /3 - 



Using Lemma 3.7, we obtain

(3.32) $I_{2n} \le A\{(1 + x^8)(1 + a^{-2})\delta_n^2 + n^{-1}\} \le A\{(1 + a^{-2})\delta_n^{4/3} + n^{-1}\}$.

Next we estimate $I_{3n}$ and $I_{4n}$. Treating the former first, note that
$E(\varphi_{12} \mid X_1) = E(\varphi_{12} \mid X_2) = 0$ and $E\varphi_{12}^2 \le 2^8(EY_1^4)^2$, and that as in Bickel,
Götze and van Zwet (1986),

(3.33) $E(\varphi_{12}\exp[it\{\xi_1(x) + \xi_2(x)\}/B_n]) = -\frac{t^2}{B_n^2}E\{\xi_1(x)\xi_2(x)\varphi_{12}\} + h(x)$,

where by using $|e^{iu} - 1 - iu| \le u^2/2$, $|e^{iu} - 1| \le |u|$, (3.27) and (3.29),



IM*)[< 



E 



+ 



^12 i 

1*1 



it( 1 (x)/B n 



B n 



B n 



E 



1 



B n 



< 



2BI 



< %{Ert 2 )^{Eg{ X )}V*{E£{ X )}V 2 

n 

<A(l + x 2 )\t\ 3 n- 1 / 2 B-\EY*)(EY 2 ) 1 / 2 6 1 n / 2 
<A(l + x 2 )\tfn- 2 5 n / 2 B n , 

since $nB_n^{-4}EY_1^4 \le \delta_n$ and $B_n^2 = nEY_1^2$. Tedious but elementary calculation
shows that

$|E\{\xi_1(x)\xi_2(x)\varphi_{12}\}| \le A(1 + x^2)n^{-2}\delta_n^2B_n^6$.

Substituting into (3.33), we deduce that

(3.34) $|E(\varphi_{12}\exp[it\{\xi_1(x) + \xi_2(x)\}/B_n])| \le A(1 + x^2)(t^2 + |t|^3)n^{-2}\delta_n^{3/2}B_n^4$.

Similarly, it follows from the identities $E(\psi_{123} \mid X_j) = 0$, for $j = 1, 2, 3$, and
from $E\psi_{123}^2 \le (EY_1^4)^3$ and (3.27), that

(3.35) $|E(\psi_{123}\exp[it\{\xi_1(x) + \xi_2(x) + \xi_3(x)\}/B_n])| \le A|t|^3n^{-3}\delta_n^{3/2}B_n^6$.
From (3.34), (3.35) and (3.23), it can be seen that

$|E\{\Delta_{n,n}\exp(itT_n)\}| \le |x|n^2B_n^{-4}|E\{\varphi_{12}\exp(itT_n)\}| + |x|n^3B_n^{-6}|E\{\psi_{123}\exp(itT_n)\}|$
$\le A(1 + |x|^3)\delta_n^{3/2}(t^2 + |t|^3)e^{-t^2/8}$;

and hence

(3.36) $I_{3n} = \int_{|t| \le \delta_n^{-1/4}} |E\{\Delta_{n,n}\exp(itT_n)\}|\, dt \le A(1 + |x|^3)\delta_n^{3/2} \le A\delta_n^{5/4}$.

We next estimate $I_{4n}$. Put $\Delta_{n,m}^* = \Delta_{n,n} - \Delta_{n,m}$. In view of (3.26),

$|E\exp\{it(T_n + \Delta_{n,n})\} - E\exp\{it(T_n + \Delta_{n,m}^*)\} - itE\Delta_{n,m}\exp\{it(T_n + \Delta_{n,m}^*)\}| \le 2^9t^2x^2mn^{-1}\delta_n^2$.

This inequality, together with the independence of the $X_j$'s, implies that for
any $1 \le m \le n$,

(3.37) $|E\exp\{it(T_n + \Delta_{n,n})\}| \le |g(t, x)|^{m-2} + A|x|\delta_n|t||g(t, x)|^{m-5} + At^2x^2mn^{-1}\delta_n^2$,

where we have used the bound $E|\Delta_{n,m}| \le (E\Delta_{n,m}^2)^{1/2} \le A|x|\delta_n$.

Let $n_0 = [16nt^{-2}\log(\delta_n^{-1})] + 5$, where $[\cdot]$ denotes the integer part function.
It is clear that $1 \le n_0 \le n$ for $\delta_n^{-1/4} \le |t| \le \min\{\delta_n^{-2}, c_0n^{1/2}\}$, for $n$ large
enough. Hence, choosing $m = n_0$ in (3.37) and using (3.23), we get

,„„ a x hn= I 1/4 „ lEexp^i^ + A^)}!^- 1 ^ 

(3.38) J5 i ; 1/4 <|t|<min{5- 2 ,con 1 /2} 

Substituting the bounds for $I_{1n}, \ldots, I_{4n}$ back into (3.30), and recalling that
$\delta_n \to 0$, we obtain (3.16), and hence complete the proof of Proposition 3.3.

3.6. Proof of Proposition 3.4. Without loss of generality we assume that
$\delta_n \le n^{-1/3}$. Indeed, for $\delta_n > n^{-1/3}$, it is obvious that the term $O(n^{-1/2})$ on
the right-hand side of (2.3) can be replaced by $O(n^{-1})$. Note that $\delta_n \le n^{-1/3}$
implies that $nP(|X| > b_n) \le n^{-1/3}$. This, together with the fact that the
distribution of $X$ is in the domain of attraction of a normal law, implies
that $EX^2 < \infty$.

We continue to use the notation in the proof of Proposition 3.3. Further, 
we put 

x m n Gx m n n 

n j=lk=m+l n j=lk=m+ll=k+l 

^ n—l n n—2 n n 

^ ( S m = ^ X Vik + -^ X X 

" j=m+l k=j+l n j=m+l k=j+l l=k+l 
As in the proof of (3.26), we have that, for $0 \le x \le \delta_n^{-1/12}$,

(3.39) $E(\Delta_{n,n} - \Delta_{n,m}^{(1)} - \Delta_{n,m}^{(2)})^2 \le 2^{10}m^2n^{-2}x^2\delta_n^2 \le Am^2n^{-2}\delta_n^{11/6}$.

Hence, for $m_0 = C\log n$, where $C$ is a constant that we shall specify later,

$P(|\Delta_{n,n} - \Delta_{n,m_0}^{(1)} - \Delta_{n,m_0}^{(2)}| \ge n^{-1}) \le AC^2(\log n)^2\delta_n^{11/6} \le AC^2\delta_n^{3/2}$.

Proposition 3.4 will now follow if we show that, for $0 \le x \le \delta_n^{-1/12}$,

(3.40) $\sup_{-\infty<y<\infty} |P(T_n + \Delta_{n,m_0}^{(1)} + \Delta_{n,m_0}^{(2)} \le y) - \{\Phi(y) + \mathcal{L}_n(y)\}| = o(\delta_n) + O(n^{-1})$.
Throughout the proof of Proposition 3.4, we assume that < a < |, < 

— 1/12 

x < 5 n and n is sufficiently large. We need the following lemma, the 
proof of which can be found in Prawitz (1972). See also Bentkus, Götze and
van Zwet (1997).

Lemma 3.8. Let $F$ be a distribution function with characteristic
function $f$. Then for all $y \in \mathbb{R}$ and $T > 0$, it holds that

(3.41) $\lim_{z\downarrow y} F(z) \le \tfrac{1}{2} + \mathrm{P.V.}\int_{-T}^{T} \exp(-iyt)T^{-1}K(t/T)f(t)\, dt$,

(3.42) $\lim_{z\uparrow y} F(z) \ge \tfrac{1}{2} - \mathrm{P.V.}\int_{-T}^{T} \exp(-iyt)T^{-1}K(-t/T)f(t)\, dt$,

where

$\mathrm{P.V.}\int_{-T}^{T} = \lim_{h\downarrow 0}\Bigl(\int_{-T}^{-h} + \int_{h}^{T}\Bigr)$

and $2K(s) = K_1(s) + iK_2(s)/(\pi s)$,

$K_1(s) = 1 - |s|$, $K_2(s) = \pi s(1 - |s|)\cot \pi s + |s|$ for $|s| < 1$,

and $K(s) = 0$ for $|s| \ge 1$.

We shall give the proof of (3.40) by using Lemma 3.8 and some of the
techniques of Bentkus, Götze and van Zwet (1997). By $E_k(\cdot) = E(\cdot \mid X_{k+1}, \ldots, X_n)$
we shall denote expectation conditional on $X_{k+1}, \ldots, X_n$. Define



Tl = n y2 § -2/3 Ei 



n/2 

#4 

n fc=mo+l 



r 2 = n^ 2 5- 2 ^E mo 



B 4 



k=n/2+l 



and put $\tau_0 = 1 - \limsup_{|t|\to\infty} |Ee^{itX}|$,

$H = \frac{\tau_0 n^{1/2}\delta_n^{-2/3}}{16(1 + \tau_1 + \tau_2)}$.
As in the proof of (3.26),

{ \ n n k =m +l I \ n k=n/2+l ) ) 

< Ax 2 5 2 / 3 . 

This, together with the bound $0 \le x \le \delta_n^{-1/12}$, implies that

(3.43) $E(H^{-2}) \le An^{-1}\delta_n^{4/3}E(1 + \tau_1 + \tau_2)^2 \le An^{-1}\delta_n^{4/3}$.

Also, we have that $H \le (\tau_0/16)n^{1/2}\delta_n^{-2/3}$.

Returning to the proof of (3.40), note that $H$ depends only on $X_{m_0+1}, \ldots, X_n$.
Using (3.41), and arguing as in Bentkus, Götze and van Zwet (1997), we
obtain

(3.44) $2P(T_n + \Delta_{n,m_0}^{(1)} + \Delta_{n,m_0}^{(2)} \le y) \le 1 + EI_1 + EI_2$,

where, with $f(t) = E_{m_0}\exp[it\{T_n + \Delta_{n,m_0}^{(1)} + \Delta_{n,m_0}^{(2)}\}]$,

$I_1 = H^{-1}\int_{\mathbb{R}} \exp(-iyt)K_1(t/H)f(t)\, dt$,

$I_2 = \frac{i}{\pi}\,\mathrm{P.V.}\int_{\mathbb{R}} \exp(-iyt)K_2(t/H)f(t)t^{-1}\, dt$.

The following results are derived by Hall and Wang (2003), on which the
present paper is based:

(3.45) $|EI_1| = o(\delta_n) + O(n^{-1})$,

(3.46) $|EI_2 + 1 - 2\{\Phi(y) + \mathcal{L}_n(y)\}| = o(\delta_n) + O(n^{-1})$.

It follows from (3.44)-(3.46) that

$P(T_n + \Delta_{n,m_0}^{(1)} + \Delta_{n,m_0}^{(2)} \le y) \le \Phi(y) + \mathcal{L}_n(y) + o(\delta_n) + O(n^{-1})$.

Similarly, using (3.42) and symmetry arguments, one can show that

$P(T_n + \Delta_{n,m_0}^{(1)} + \Delta_{n,m_0}^{(2)} > y) \le 1 - \{\Phi(y) + \mathcal{L}_n(y)\} + o(\delta_n) + O(n^{-1})$.

Result (3.40) now follows, and hence the proof of Proposition 3.4 is complete.